12 research outputs found

    Turkish Discourse Bank: Porting a discourse annotation style to a morphologically rich language

    This paper briefly describes the Turkish Discourse Bank, the first publicly available annotated discourse resource for Turkish. It focuses on the challenges posed by annotating Turkish, a free word order language with rich inflectional and derivational morphology. It shows the usefulness of PDTB-style annotation but points out the need to extend this annotation style to meet the needs of the target language.

    The annotation scheme of the Turkish Discourse Bank and an evaluation of inconsistent annotations

    In this paper, we report on the annotation procedures we developed for the Turkish Discourse Bank (TDB), an effort that extends the Penn Discourse Treebank (PDTB) annotation style by applying it to Turkish discourse. After a brief introduction to the TDB, we describe the annotation cycle and the annotation scheme we developed, identifying which parts of the scheme extend the PDTB and which parts differ from it. We provide inter-coder reliability calculations on the first and second arguments of some connectives and discuss the most important sources of disagreement among annotators.

    Annotating Subordinators in the Turkish Discourse Bank

    In this paper we explain how we annotated subordinators in the Turkish Discourse Bank (TDB), an effort that started in 2007 and is still continuing. We introduce the project and describe some of the issues that were important in annotating three subordinators, namely karşın, rağmen and halde, all of which encode the coherence relation Contrast-Concession. We also describe the annotation tool.

    An assessment of the ODTÜ discourse-level annotated corpus and a cascaded model for the automatic identification of phrasal expressions in Turkish

    This Ph.D. thesis presents a methodology for an overall assessment of the Turkish Discourse Bank (TDB), a linguistic resource in which discourse relations overtly expressed by discourse connectives have been identified and annotated with the two arguments they relate. We provide a quantitative and qualitative assessment of the TDB in order to establish the reliability of this discourse resource for Turkish, and we suggest that our methodology can be used for reliability evaluations of other annotated corpora. Our quantitative evaluation consists of calculating in-depth statistical measures using the Kappa statistic and extra evaluators originally used in evaluating information retrieval systems. A two-way methodology for calculating the agreement statistics is proposed: a Common Arguments approach and an Overall approach. Although the Overall approach is effective on its own, we propose a comparison of these two approaches, which makes it possible to pinpoint sources of disagreement more accurately. As part of our qualitative evaluation, we present a novel effort to automatically identify, in any Turkish text, discursive uses of phrasal expressions that have been annotated systematically alongside explicit discourse connectives in the TDB. Our cascaded model achieves full recall, provides 99.95% accuracy, and can be used to enlarge the coverage of the TDB with little effort.

    Introducing the ODTÜ Discourse-Level Annotated Corpus Project

    The ODTÜ Discourse-Level Annotated Corpus (ODTÜ-MEDİD) Project is a project we started in 2007. It aims to turn the ODTÜ Turkish Corpus (Say et al. 2002), a resource annotated with information such as text genre, author and year of publication, into a discourse-level resource. The project shares the principles of the PDTB (http://www.seas.upenn.edu/~pdtb), a 1-million-word English corpus annotated at the discourse level. This paper introduces the project and shows the importance of linguistic facts in annotation decisions.
